We’ll be using gganimate to do our animating. If you
don’t already have gifski or av as an
installed library, you’ll want to do that (these are what support the
creation of GIF and movie files respectively.
Yet again, we’ll be using the gapminder dataset.
Load the necessary libraries:
library(gapminder)
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.3.6 ✔ purrr 0.3.4
## ✔ tibble 3.1.8 ✔ dplyr 1.0.10
## ✔ tidyr 1.2.1 ✔ stringr 1.4.1
## ✔ readr 2.1.2 ✔ forcats 0.5.2
## Warning: package 'readr' was built under R version 4.0.5
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
library(gganimate)
Remember that gganimate is built on top of
ggplot, so let’s first get a solid static visualization
that we can use as our base. We’ll use a robust, but completely standard
ggplot call:
p1 <- ggplot(gapminder, aes(gdpPercap, lifeExp, size = pop, color = country)) +
geom_point(alpha = 0.7, show.legend = FALSE) +
scale_color_manual(values = country_colors) +
scale_size(range = c(2, 12)) +
scale_x_log10() +
facet_wrap(~continent) +
theme_bw() +
labs(title = "Year: 1952-2007", x = "GPD per capita", y = "Life Expectancy")
print(p1)
To turn it into an animation, we simply add a few functions:
p2 <- p1 +
labs(title = "Year: {frame_time}", x = "GDP per capita", y = "Life Expectancy") +
transition_time(year) +
ease_aes('linear')
animate(p2)
anim_save("gapminder1.gif")
We can use shadow_wake() to draw a small wake after the
data by showing the latest frames up to the current. You can choose to
gradually diminish the size and/or opacity of the shadow. The length of
the wake is not given in absolute frames, it is given as a proportion of
the total length of the animation.
p3 <- p2 +
shadow_wake(wake_length = 0.3, alpha = FALSE)
animate(p3)
anim_save("gapminder2.gif")
Alternatively we can use shadow_trail() to show the
original data as a trail.
p4 <- p2 +
shadow_trail()
animate(p4)
anim_save("gapminder3.gif")
We’ve created a standard line plot of lifeExp by country, faceted by continent.
p5 <- ggplot(gapminder %>% filter(continent == "Asia"), aes(year, lifeExp, color = country)) +
geom_line(show.legend = FALSE)
p5
We can then call transition_reveal to let the data
gradually appear, by year. The geom_point call means that
as it appears it shows a point.
p6 <- p5 +
geom_point(show.legend = FALSE) +
transition_reveal(year)
animate(p6)
## geom_path: Each group consists of only one observation. Do you need to adjust
## the group aesthetic?
## geom_path: Each group consists of only one observation. Do you need to adjust
## the group aesthetic?
anim_save("gapminder4.gif")
Here we create a bar chart and then add an additional aesthetic
called transition_states that provides a frame variable of
year. For each value of the variable, a step on the chart will be
drawn.
p7 <- gapminder %>%
group_by(year, continent) %>%
summarize(cont_pop = sum(pop)) %>%
ggplot(aes(continent, cont_pop, fill = continent)) +
geom_bar(stat = "identity") +
transition_states(year, transition_length = 2, state_length = 2) +
ease_aes('sine-in-out') +
labs(title = "Population in {closest_state}")
## `summarise()` has grouped output by 'year'. You can override using the
## `.groups` argument.
animate(p7)
anim_save("gapminder5.gif")
First, we get the data prepped, which includes grouping by year, sort descending by population, assigning the rank, and then filtering to the top 10 for each year.
ranked_by_year <- gapminder %>%
select(country, pop, year, continent) %>%
group_by(year) %>%
arrange(year, -pop) %>%
mutate(rank = min_rank(-pop)) %>%
filter(rank <= 10)
ranked_by_year
## # A tibble: 120 × 5
## # Groups: year [12]
## country pop year continent rank
## <fct> <int> <int> <fct> <int>
## 1 China 556263527 1952 Asia 1
## 2 India 372000000 1952 Asia 2
## 3 United States 157553000 1952 Americas 3
## 4 Japan 86459025 1952 Asia 4
## 5 Indonesia 82052000 1952 Asia 5
## 6 Germany 69145952 1952 Europe 6
## 7 Brazil 56602560 1952 Americas 7
## 8 United Kingdom 50430000 1952 Europe 8
## 9 Italy 47666000 1952 Europe 9
## 10 Bangladesh 46886859 1952 Asia 10
## # … with 110 more rows
Then we create a static plot:
geom_rect which needs the four corners of the
rectangle.geom_text for the country labels.This gives us our static faceted plot.
p8 <- ranked_by_year %>%
ggplot(aes(xmin = 0, xmax = pop / 1000000,
ymin = rank - .45, ymax = rank +.45, y = rank,
fill = continent)) +
geom_rect(alpha = .7) +
facet_wrap(~ year) +
scale_y_reverse() +
scale_x_continuous(limits = c(-800, 1400)) +
geom_text(x = -50,
hjust = "right",
col = "grey",
aes(label = country)) +
labs(x = "Population (millions)",
y = "") +
theme_void()
print(p8)
Then we remove the facet, refine the X scale, add a numeric label
text, set the group aesthetic to country and then set the
transition_time to one year. Then we save as a GIF.
p8 +
facet_null() +
scale_x_continuous(limits = c(-355, 1400)) +
geom_text(x = 1000, y = -9.5,
aes(label = as.character(year)),
size = 30, col = "grey") +
aes(group = country) +
transition_time(year)
## Scale for 'x' is already present. Adding another scale for 'x', which will
## replace the existing scale.
anim_save("bar_race1.gif")
We begin by reading in the data direct from Git, select only the variables and observations we need. Then clean up the data (changing vars to numeric, renaming them), including pivoting it into a longer dataset. Then we group by year, we rank it, we get some relative values, we format a display label, and then limit it just to the top 10 for any given year. Here’s what that data now looks like:
library(janitor)
##
## Attaching package: 'janitor'
## The following objects are masked from 'package:stats':
##
## chisq.test, fisher.test
gdp <- read_csv("https://raw.githubusercontent.com/amrrs/animated_bar_charts_in_R/master/data/GDP_Data.csv")
## Rows: 269 Columns: 16
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (16): Series Name, Series Code, Country Name, Country Code, 1990 [YR1990...
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
gdp <- gdp %>% select(3:15)
gdp <- gdp[1:217,]
gdp_tidy <- gdp %>%
mutate_at(vars(contains("YR")), as.numeric) %>%
pivot_longer(cols = 3:13, names_to = "year") %>%
mutate(year = as.numeric(str_sub(year, 1, 4))) %>%
clean_names()
## Warning in mask$eval_all_mutate(quo): NAs introduced by coercion
## Warning in mask$eval_all_mutate(quo): NAs introduced by coercion
## Warning in mask$eval_all_mutate(quo): NAs introduced by coercion
## Warning in mask$eval_all_mutate(quo): NAs introduced by coercion
## Warning in mask$eval_all_mutate(quo): NAs introduced by coercion
## Warning in mask$eval_all_mutate(quo): NAs introduced by coercion
## Warning in mask$eval_all_mutate(quo): NAs introduced by coercion
## Warning in mask$eval_all_mutate(quo): NAs introduced by coercion
## Warning in mask$eval_all_mutate(quo): NAs introduced by coercion
## Warning in mask$eval_all_mutate(quo): NAs introduced by coercion
## Warning in mask$eval_all_mutate(quo): NAs introduced by coercion
gdp_formatted <- gdp_tidy %>%
group_by(year) %>%
mutate(rank = rank(-value),
value_rel = value/value[rank==1],
value_label = paste0(" ", round(value/1e9))) %>%
filter(rank <= 10)
gdp_formatted
## # A tibble: 110 × 7
## # Groups: year [11]
## country_name country_code year value rank value_rel value_label
## <chr> <chr> <dbl> <dbl> <dbl> <dbl> <chr>
## 1 Brazil BRA 1990 4.62e11 10 0.0773 " 462"
## 2 Brazil BRA 2000 6.55e11 10 0.0637 " 655"
## 3 Brazil BRA 2009 1.67e12 8 0.116 " 1667"
## 4 Brazil BRA 2010 2.21e12 7 0.148 " 2209"
## 5 Brazil BRA 2011 2.62e12 7 0.169 " 2616"
## 6 Brazil BRA 2012 2.47e12 7 0.153 " 2465"
## 7 Brazil BRA 2013 2.47e12 7 0.148 " 2473"
## 8 Brazil BRA 2014 2.46e12 7 0.141 " 2456"
## 9 Brazil BRA 2015 1.80e12 9 0.0995 " 1802"
## 10 Brazil BRA 2016 1.79e12 9 0.0963 " 1794"
## # … with 100 more rows
From there, we build the plot. A few notes:
geom_tile, which is basically the
same as geom_rect, they just take different arguments.geom_text for the Name label.geom_text for the value label.scales to pretty up the Y scale displayp9 <- ggplot(gdp_formatted, aes(rank, group = country_name,
fill = as.factor(country_name),
color = as.factor(country_name))) +
geom_tile(aes(y = value/2,
height = value,
width = 0.9), alpha = 0.8, color = NA) +
geom_text(aes(y = 0, label = paste(country_name, " ")), vjust = 0.2, hjust = 1) +
geom_text(aes(y = value, label = value_label, hjust = 0)) +
scale_y_continuous(labels = scales::comma) +
scale_x_reverse() +
coord_flip(clip = "off", expand = FALSE) +
theme(axis.line=element_blank(),
axis.text.x=element_blank(),
axis.text.y=element_blank(),
axis.ticks=element_blank(),
axis.title.x=element_blank(),
axis.title.y=element_blank(),
legend.position="none",
panel.background=element_blank(),
panel.border=element_blank(),
panel.grid.major=element_blank(),
panel.grid.minor=element_blank(),
panel.grid.major.x = element_line( size=.1, color="grey" ),
panel.grid.minor.x = element_line( size=.1, color="grey" ),
plot.title=element_text(size=25, hjust=0.5, face="bold", colour="grey", vjust=-1),
plot.subtitle=element_text(size=18, hjust=0.5, face="italic", color="grey"),
plot.caption =element_text(size=8, hjust=0.5, face="italic", color="grey"),
plot.background=element_blank(),
plot.margin = margin(2,2, 2, 4, "cm"))
print(p9)
Now it’s time to animate. In the code below, we set the transition state to cycle through year, take 4 times as long going to the next cut as we do pausing there. We fix the X axis, but allow Y to vary, which is the default behavior. We also set the title label to vary so that it captures the closest state (year). Finally we animate and then save it as a GIF.
p10 <- p9 +
transition_states(year, transition_length = 4, state_length = 1) +
view_follow(fixed_x = TRUE) +
labs(title = "GPD per Year : {closest_state}",
subtitle = "Top 10 Countries",
caption = "GDP in Billions USD | Data Source: World Bank Data")
animate(p10)
anim_save("bar_race2.gif")
Leaflet is a powerful open-source JavaScript library for building interactive maps in HTML.
The architecture is very similar to ggplot2, but instead of putting data-based layers on top of a static map, leaflet allows you to put data-based layers on top of an interactive map.
A leaflet map widget is created with the leaflet()
command. We then add layers to the widget. The first layer that we will
add is a tile layer containing all of the static map information, which
by default comes from OpenStreetMap. The second layer we will add here
is a marker, which designates a point location. Notice how the
addMarkers() function can take a data argument, just like a geom_*()
layer in ggplot2 would.
white_house <- tibble(
address = "The White House, Washington, DC"
) %>%
tidygeocoder::geocode(address, method = "osm")
## Passing 1 address to the Nominatim single address geocoder
## Query completed in: 1 seconds
library(leaflet)
## Warning: package 'leaflet' was built under R version 4.0.5
white_house_map <- leaflet() %>%
addTiles() %>%
addMarkers(data = white_house)
## Assuming "long" and "lat" are longitude and latitude, respectively
white_house_map
You can scroll and zoom at will!
You can also add a pop-up to provide more information about a particular location. Notice how we only need to call the previously saved leaflet map and then add a Popup layer to it.
white_house <- white_house %>%
mutate(title = "The White House",
street_address = "1600 Pennsylvania Ave")
white_house_map %>%
addPopups(data = white_house,
popup = ~paste0("<b>", title, "</b></br>", street_address))
## Assuming "long" and "lat" are longitude and latitude, respectively
There are several different providers of tiles. Below we’ll demonstrate two others, and we’ll also see how we can set a specific view and zoom level.
# Background 1: NASA
leaflet() %>%
addTiles() %>%
setView(lng = 2.34, lat = 48.85, zoom = 5) %>%
addProviderTiles("NASAGIBS.ViirsEarthAtNight2012")
# Background 2: World Imagery
leaflet() %>%
addTiles() %>%
setView(lng = 2.34, lat = 48.85, zoom = 3) %>%
addProviderTiles("Esri.WorldImagery")
We’ll be showing 2016 House election results in NC.
# install.packages("fec16")
library(fec16)
nc_results <- results_house %>% # built in fec16 data
mutate(district = parse_number(district_id)) %>%
left_join(candidates, by = "cand_id") %>% # candidates is also built in fec16 data
select(state, district, cand_name, party, general_votes) %>%
arrange(desc(general_votes)) %>%
filter(state == "NC") %>%
group_by(state, district) %>%
summarize(N = n(),
total_votes = sum(general_votes, na.rm = T),
d_votes = sum(ifelse(party == "DEM", general_votes, 0), na.rm = T),
r_votes = sum(ifelse(party == "REP", general_votes, 0), na.rm = T),
other_votes = total_votes - d_votes - r_votes,
r_prop = r_votes / total_votes,
winner = ifelse(r_votes > d_votes, "Republican", "Democrat"))
## `summarise()` has grouped output by 'state'. You can override using the
## `.groups` argument.
nc_results
## # A tibble: 13 × 9
## # Groups: state [1]
## state district N total_votes d_votes r_votes other_votes r_prop winner
## <chr> <dbl> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>
## 1 NC 1 3 350699 240661 101567 8471 0.290 Democrat
## 2 NC 2 8 390567 169082 221485 0 0.567 Republic…
## 3 NC 3 5 323701 106170 217531 0 0.672 Republic…
## 4 NC 4 3 409541 279380 130161 0 0.318 Democrat
## 5 NC 5 5 355512 147887 207625 0 0.584 Republic…
## 6 NC 6 3 351150 143167 207983 0 0.592 Republic…
## 7 NC 7 2 347706 135905 211801 0 0.609 Republic…
## 8 NC 8 3 323045 133182 189863 0 0.588 Republic…
## 9 NC 9 4 332493 139041 193452 0 0.582 Republic…
## 10 NC 10 5 349744 128919 220825 0 0.631 Republic…
## 11 NC 11 3 359508 129103 230405 0 0.641 Republic…
## 12 NC 12 10 349300 234115 115185 0 0.330 Democrat
## 13 NC 13 22 355492 156049 199443 0 0.561 Republic…
Now we need a congressional district shapefile for the 114th
Congress. Remember that the USAboundaries package has CD
files. We also need to load up the sf library so we can
work with sf data.
library(sf)
## Warning: package 'sf' was built under R version 4.0.5
## Linking to GEOS 3.9.1, GDAL 3.4.0, PROJ 8.1.1; sf_use_s2() is TRUE
library(USAboundaries)
nc_map <- us_congressional(resolution = "high", states = "NC")
ggplot(nc_map) +
geom_sf()
We need to merge in the election data with the shape file. The
simplest way to do that is to use an inner_join(). Here we
merge the nc_shp polygons with the nc_results election data frame using
the district as the key.
nc_merged <- nc_map %>%
mutate(district = str_remove(cd116fp, "^0+") %>% as.numeric) %>%
inner_join(nc_results, by = "district")
glimpse(nc_merged)
## Rows: 13
## Columns: 22
## $ statefp <chr> "37", "37", "37", "37", "37", "37", "37", "37", "37"…
## $ cd116fp <chr> "01", "06", "05", "13", "09", "07", "02", "11", "04"…
## $ affgeoid <chr> "5001600US3701", "5001600US3706", "5001600US3705", "…
## $ geoid <chr> "3701", "3706", "3705", "3713", "3709", "3707", "370…
## $ namelsad <chr> "Congressional District 1", "Congressional District …
## $ lsad <chr> "C2", "C2", "C2", "C2", "C2", "C2", "C2", "C2", "C2"…
## $ cdsessn <chr> "116", "116", "116", "116", "116", "116", "116", "11…
## $ aland <dbl> 15207152815, 10128871422, 10280081294, 4745301686, 1…
## $ awater <dbl> 525752701, 209014034, 80701577, 105117478, 85773395,…
## $ state_name <chr> "North Carolina", "North Carolina", "North Carolina"…
## $ state_abbr <chr> "NC", "NC", "NC", "NC", "NC", "NC", "NC", "NC", "NC"…
## $ jurisdiction_type <chr> "state", "state", "state", "state", "state", "state"…
## $ district <dbl> 1, 6, 5, 13, 9, 7, 2, 11, 4, 10, 8, 3, 12
## $ state <chr> "NC", "NC", "NC", "NC", "NC", "NC", "NC", "NC", "NC"…
## $ N <int> 3, 3, 5, 22, 4, 2, 8, 3, 3, 5, 3, 5, 10
## $ total_votes <dbl> 350699, 351150, 355512, 355492, 332493, 347706, 3905…
## $ d_votes <dbl> 240661, 143167, 147887, 156049, 139041, 135905, 1690…
## $ r_votes <dbl> 101567, 207983, 207625, 199443, 193452, 211801, 2214…
## $ other_votes <dbl> 8471, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
## $ r_prop <dbl> 0.2896130, 0.5922910, 0.5840169, 0.5610337, 0.581822…
## $ winner <chr> "Democrat", "Republican", "Republican", "Republican"…
## $ geometry <MULTIPOLYGON [°]> MULTIPOLYGON (((-79.00854 3..., MULTIPOLYGON (((-80.…
We can use Leaflet. First we will define a color palette over the
values [0,1] that ranges from red to blue. According to the
documentation, colorNumeric():
Conveniently maps data values (numeric or factor/character) to colors according to a given palette, which can be provided in a variety of formats.
The domain parameter tells it the possible values that
can be mapped.
pal <- colorNumeric(palette = "RdBu", domain = c(0,1))
To make the plot in Leaflet, we have to add the tiles, and then the polygons defined by the sf object nc_merged. Since we want red to be associated with the proportion of Republican votes, we will map ‘1-r_prop’ to color. Note that we also add popups with the actual proportions, so that if you click on the map, it will show the district number and the proportion of Republican votes.
leaflet_nc <- leaflet(nc_merged) %>%
addTiles() %>%
addPolygons(
weight = 1, fillOpacity = 0.7,
color = ~pal(1 - r_prop),
popup = ~paste("District", district, "</br>", round(r_prop, 4))) %>%
setView(lng = -80, lat = 35, zoom = 7)
leaflet_nc
ggplotly, built and maintained by plotly, allows you to
convert any ggplot visualization into a plotly visualization using the
ggplotly() function.
Below we create a standard static ggplot object.
library(plotly)
##
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
##
## last_plot
## The following object is masked from 'package:stats':
##
## filter
## The following object is masked from 'package:graphics':
##
## layout
p11 <- gapminder %>%
mutate(logGDPpercap = log(gdpPercap)) %>%
ggplot(aes(lifeExp, logGDPpercap)) +
stat_density2d(geom = 'polygon', aes(fill = ..level..))
print(p11)
All you need to do is pass it the ggplotly() function
and it creates an interactive graphic.
p11 <- ggplotly(p11)
print(p11)
You can also do direct plotly functions, skipping ggplot entirely. This is especially useful when they have a chart format that isn’t easily available in ggplot, such as a stock candlestick chart.
library(tidyquant)
## Loading required package: lubridate
##
## Attaching package: 'lubridate'
## The following objects are masked from 'package:base':
##
## date, intersect, setdiff, union
## Loading required package: PerformanceAnalytics
## Loading required package: xts
## Loading required package: zoo
##
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
##
## Attaching package: 'xts'
## The following object is masked from 'package:leaflet':
##
## addLegend
## The following objects are masked from 'package:dplyr':
##
## first, last
##
## Attaching package: 'PerformanceAnalytics'
## The following object is masked from 'package:graphics':
##
## legend
## Loading required package: quantmod
## Loading required package: TTR
## Registered S3 method overwritten by 'quantmod':
## method from
## as.zoo.data.frame zoo
prices <- tq_get("GOOGL")
prices %>%
plot_ly(x = ~date,
type = "candlestick",
open = ~open,
close = ~close,
high = ~high,
low = ~low,
split = ~symbol)